# Faculty of Science FINAL EXAMINATION

# COMPUTER SCIENCE COMP 273 INTRODUCTION TO COMPUTER SYSTEMS

Examiner: Prof. Michael Langer April 18, 2012 Associate Examiner: Mr. Joseph Vybihal 2 P.M. – 5 P.M.

| STUDENT NAME: | ID: |  |
|---------------|-----|--|
|               |     |  |

### **Instructions:**

The exam is 16 pages, including this cover page.

There are 11 a Assignment 5 Project Exam Help

Answer all questions on the ex

The 16 pages include two types://eduassistpro.github.io/

No calculators, notes, or books are allowed Add WeChat edu\_assist\_pro

| Question | Points | Grade |
|----------|--------|-------|
| 1        | 12     |       |
| 2        | 3      |       |
| 3        | 4      |       |
| 4        | 5      |       |
| 5        | 4      |       |
| 6        | 4      |       |
| 7        | 3      |       |
| 8        | 3      |       |
| 9        | 4      |       |
| 10       | 4      |       |
| 11       | 4      |       |
| Total    | 50     |       |

#### 1. **(12 points)**

For each of the three MIPS instructions below:

- specify as much of the machine code of the instruction as you can, showing how many bits are in each field and leaving unknown fields blank;
- draw the datapath and controls for a *single cycle* implementation of the instruction; only include parts of the datapath that are used in the instruction; specify the bit width of any lines you draw in the datapath, and write any known values on the lines.

```
(a) addi $15, $0, -8  # add immediate

(b) sltu $16, $15, $0  # set less than (unsigned)

(c) jalr $16, $15  # special version of jump and link
```

The special version of the jump-and-link instruction, jalr rs rd, jumps to the address in register rs and puts the return address in register rd.

Assignment Project Exam Help

https://eduassistpro.github.io/ Add WeChat edu\_assist\_pro

# **SOLUTION:**

Question 1a (7 points) 1 point for each of the following:

- I-format partition with the appropriately labelled fields of each.
- The values 15 and 0 should be given in binary and some indication given of which fields they go into.
- fetch stage: PC register to Memory (instructions) and instruction read out and decomposed into fields.
- $PC \leftarrow PC + 4$
- lines into and out of registers should be labelled: indicate you know which instruction fields correspond to which lines.
- sign extend the immediate argument from 16 to 32 bits
- the value read from the register is fed to the ALU, along with the sign extended value; result is written into a register

Assignment Project Exam Help





#### **SOLUTION:**

# Question 1b (2 points)

The datapath here is similar to 1a so we only give a few more points here.

- sltu is R format so the partition of the instruction into fields is different. You need to know how many bits are in each field. (You should know the opcode is 000000 for R format, but we didn't require this.)
  - The rs, rt, rd fields are 15 (011111), 0 (00000), 16 (10000). The bit codes should be written here and put in correct position.
- Otherwise, it is the same datapath as the add instruction from the lecture notes. It reads two registers, sends the two 32 bit values into the ALU. The result is then written back to the registers.

I am expecting some labelling of the wires in the registers to indicate that the student knows that the rd register is the one we write to and the rs and rt are the ones we read from

Some students tried to write up some special circuitry to compute the value from the ALU, but this is not necessar

# https://eduassistpro.github.io/





#### **SOLUTION:**

# Question 1c (3 points)

The datapath for the jalr instruction is a combination of jr and jal datapaths which I sketched out in class.

- You might use an I format or an R format for this. (The correct answer is R format but I don't expect you to know that.)
  - If you assumed I format, then the rs register (source) register 15 (01111) is read out and fed into the PC, and the destination register would be rt and would have the value 16 (10000). This is where PC + 4 is written.
  - If you assumed R format, then the rs register (source) register 15 (01111) is read out and fed into the PC, and the destination register would be rd and would have the value 16 (10000). This is where PC + 4 is written.
- the PC is updated with the 32 bit value from the source register. Some students might think that only the low order 28 bits are used as in the jump instruction (j) but this is inconcerning inconcerning.
- the PC+4 value is written into a register so there should be a line from the PC to the register. (They in there, and wttps://eduassistpro.github.io/

# Add WeChat edu\_assist\_pro





Convert each of the MIPS pseudoinstructions into real MIPS instructions that perform the equivalent operation.

*Hint:* For (b), you are allowed to use up to three instructions. If you need a temporary register, use \$1.

- (a) move \$s0, \$s1
- (b) seq \$s0, \$s1, \$s2 # set if equal (similar to slt)

#### **SOLUTION:**

#### (a) **(1 point)**

I was expecting a one instruction solutions such as:

# Assigninent Project Exam Help

#### Grading sch

addi \$shttps://eduassistpro.github.io/Give only 0.5 for that.

# Add WeChat edu\_assist\_pro

#### (b) **(2 points)**

Another clever solution used xor for the first line, instead:

```
xor $s0, $s1, $s2 # gives $s0 = 0 if and only if $s1=$s2
```

**Grading scheme:** Gave 1 point if the student performed a subtraction in the first step but otherwise the solution didn't make sense.

For a processor that uses a microinstruction implementation:

- (a) What is the microprogram counter, and how does it differ from the program counter?
- (b) In general, what are the bits in a microinstruction? How do they differ from the bits in a (regular machine code) instruction?
- (c) How does the processor decide which instruction to execute next?
- (d) How does the processor decide which microinstruction to execute next?

#### **SOLUTION:**

- (a) Each machine code instruction is broken into a sequence of microinstructions. A microprogram counter keeps track of which microinstruction we are executing (within the machine code instruction).
- (b) The bits is a light that microinstruction e.g. ALUOp, WriteReg, ReadMem, etc. This is different from the bits in an actual instruction of the bits in act
- fileds (for R forttps://eduassistpro.github.io/
  (c) The input line to the possibilities. In the lecture notes, the selector was PCSr on address is PCA4 pai offer a safflitiest existing fassist\_pro branch instructions, and for exceptions.
- (d) The default for the next microinstruction is microPC+1. For the scheme I developed in class, there two cases in which microPC+1 is *not* the next microinstruction to be executed: first, when a microinstruction sequence finishes, and second, after the fetch stage. Which of these three cases is occurring is coded by control signals, and so the next microinstruction is selected using these controls. For the example in class, the control signals were Next, Fetch.

**Grading scheme:** 1 point for each of the above. For (b), if you just talked about fetch and next as if they were the only bits in the microinstruction, you got only 0.5. For (c) if you only mentioned PC+4, you got 0.5. For (d), if you just wrote "fetch and next" and hardly explained, you got 0.5.

# 4. (4 points – was originally 5)

Identify the hazards in the the following if-then-else sequence. Describe how to resolve these hazards using as few nop instructions as possible.

```
if:
        slt
              $t0, $s0, $s1
              $t0, $0,
                         else
        bne
        lw
              $s3, 0($sp)
        add
              $s3, $s3, $s4
               endif
         j
else:
        sub
              $s3, $s3, $s4
endif:
              $t0, $0,
                         $0
        and
              $t1, $0,
                         $0
        or
```

#### **SOLUTION:**

|    |    | $\Delta$ e.c. | ianı | nen  | t P | roject | Exam | Heln  |
|----|----|---------------|------|------|-----|--------|------|-------|
| IF | ID | APU           | MEM  | MB11 | ιı  | TOJCCL | LAam | Ticip |
|    | IF | ID            | ALU  | MEM  | WB  | _      |      |       |

There are four (pipelin nttps://eduassistpro.github.io/

- (a) (data hazard) The r eeded by the bne instruction. The value is known after the AL lt and must forwarded to the ALU stage of the vac instruction.
- (b) (control hazard) The lw and add instructi bne is decoded and goes through the ALU, respectively. The lw and add instructions should only be executed if bne condition is not met. If the branch condition is met, then the WriteReg control of the lw and add instructions should be set to 0. They will still go through the pipeline but they won't write back.
- (c) (data hazard) The lw instruction gets a word from Memory but this only happens at the end of the MEM stage. This value is needed at the ALU stage of the add instruction. Thus, a nop needs to be inserted between lw and add.
  - Some students may be tempted to move the or instruction in there, instead of nop. But this is incorrect since add is conditioned on the branch, but or is unconditional.
- (d) (control hazard) The **sub** enters the pipeline after j but it shouldn't be executed. So we (re)set its RegWrite control so it doesn't write in the WB stage.

#### Grading scheme:

Give 1 point for identifying each hazard and showing how to resolve each. (0.5 for identification and 0.5 for solution)

Assume a computer has a main memory with 64 MB and a data cache with 256 entries of 4 words each. Assume the cache is accessed using direct mapping.

- (a) How many KB (kilobytes) of data are stored in the cache?
- (b) How many bits are in the tag field of each line of the cache?
- (c) Suppose a program makes a sequence of five memory accesses in the order shown below. The numbers are physical byte addresses in main memory.

0x1a24380 0x211f0d0 0x211f388 0x212f12c 0x1c8fad0

What are the indices of the cache entries that are used in these memory accesses? Give you Asser in hexadecimal Project Exam Help (d) What are the contents of the tag fields of these entries after all five memory accesses

have been comp

# https://eduassistpro.github.io/ SOLUTION:

- (a)  $256 \times 8 = 4096$  bytes, or 4 KB(a)  $250 \times 6 = 4090$  by respect of 4.88 (b) 64 MB requires of 4.88 of 4.88 chat we call 4.88 characteristics 4.88 chat we call 4.88 characteristics 4.88 chat we call 4.88 characteristics 4.88 cha
- bit "word offset", a tag, and an 8 bit cache index (256 entries in the cache). The tag is thus 14 bits (26 - 2 - 2 - 8 = 14).
- (c) The byte offset and word offset use bits 0,1 and 2,3 respectively. Thus, the cache index is determined by the next 8 bits (4-11), or equivalently, by the hex digits 1 and 2 (where least significant is hex digit 0). The cache indices are 0x38, 0x0d, 0x38, 0x12, 0xad. Note that two of these (0x38) are the same and so there are only four different cache lines that are accessed.
- (d) The tag field is upper part of the address, beyond the cache index. (It is 4 hex digits.) The contents of the tag fields after all five accesses are:

| index | tag  |            |           |      |       |     |              |
|-------|------|------------|-----------|------|-------|-----|--------------|
| 38    | 211f | (note that | the first | tag, | 1a24, | was | overwritten) |
| 0d    | 211f |            |           |      |       |     |              |
| 12    | 212f |            |           |      |       |     |              |
| ad    | 1c8f |            |           |      |       |     |              |

Grading scheme: if you gave the byte index in (c), we gave you the point. But if you did that they you gave five tags for (d) which was incorrect so we only gave 0.5 points for (d) in that case.

Consider a cache/TLB design in which the cache is accessed after the virtual page number has been translated to a physical page number (as discussed in class). With this design, the CPU can only fetch an instruction if the address translation is in the TLB and the instruction is in the cache. Thus, the best case scenario for fetching an instruction is a TLB hit followed by a cache hit. Outline the steps that must be taken in the *worst case* scenario, namely the instruction is on the hard disk. (You may assume that the entire page table is in main memory.)

#### **SOLUTION:**

Given a virtual address,

- check the TLB for translation (TLB miss)
- check page table entry (not valid -> page fault!)
- copy page from hard disk to main memory
- update page.table
- copy Assignment Project Exam Help
- translate virtual to physical using new TLB entry
- check cache (miss!
- copy block for the copy block

Grading scheme: I tried to give 0.5 points for each of the nine steps, a 4 points. However this goding scheme flat did Cook side SSIST\_Pro e order and used the wrong terminology. A typical mistake wa cache miss, or to have two page faults, or to say things like "move the page into the cache".

If there were lots of mistakes, then I fell back on the following: 1/4 for mostly nonsense, 2/4 for missing several important steps but otherwise correct, 3/4 for logic correct but details and a few steps missing.

- (a) Briefly describe the difference between isolated I/O and memory mapped I/O. Which does MIPS use? Explain.
- (b) What are the advantages of memory mapped I/O?

#### **SOLUTION:**

- (a) Isolated I/O has separate assembly language instructions for different I/O devices. Memory mapped I/O treats addresses in I/O devices as if they are part of the virtual address space.
  - MIPS uses memory mapped I/O. It uses the upper 64 KB of virtual address space i.e. addresses <code>0xffff\_\_\_\_</code>.
- (b) The advantage of memory mapped I/O is that we can reduce the number of assembly language instructions for each I/O legister specified in the assembly language, the assembly language program can just load and s

  translate the mapping instructions of translate the mapping instructions of the advantage of the same o

Grading scheme: 2 Aoints or West hat edu\_assist\_pro

- (a) Describe the *polling* method for I/O, and how it uses the system bus.
- (b) Argue why polling is more suitable than interrupts for mouse input.
- (c) How often do you think the mouse should be polled?

#### **SOLUTION:**

1 point for each of the following:

- (a) In polling, the CPU repeatedly and regularly asks an I/O device whether it is ready to input/output data. The CPU puts the number of the I/O device on the address bus and values on the data and control lines which specify relevant information. The I/O device then uses the bus to respond.
- (b) Polling is simpler since it can be done between processes and so there is no extra overhead. (Interrupts require saving the state of a process.)
  - Moving spige restrementy Proposition than which while departe many interrupts.
- (c) It is sufficient to poll at t second.

https://eduassistpro.github.io/

Grading scheme: 1 poi

Add WeChat edu\_assist\_pro

# 9. (3 points – originally was 4)

- (a) What is direct memory access (DMA)?
- (b) Describe how DMA could be used to record a video signal that is measured with a laptop's camera. Give sufficient detail to show that you understand DMA.

#### **SOLUTION:**

(a) DMA is used for transfering data between an I/O device and main memory, without the CPU being used.

With DMA, an I/O device is given control of the system bus, so that the CPU can continue processing instructions (of another process, by using the cache only. With DMA, the CPU tells the I/O device once that it wants to read or write to be done. The I/O device then administers the transfer of data to or from main memory on its own.

(b) Application of the system bus to the video capture I/O device.

The video captu uld buffer the sequettps://eduassistpro.github.io/  $_{\rm A\ would}^{\rm his}$  would be hundred frames into main memory (

Grading scheme: Add WeChat edu\_assist\_pro

- (a) The first step that occurs in an interrupt is that the CPU receives a IRQ=1 signal. Describe the steps that the CPU takes after receiving this signal, assuming it allows the interrupt. (In this part of the question, you may assume that there is a separate pair of IRQ/IACK lines for each I/O device.)
- (b) Describe the daisy chaining method for interrupts.

#### **SOLUTION:**

- (a) After receiving the interrupt request, IRQ = 1, the CPU decides whether or not to grant it. If yes, then the interrupt handler:
  - saves the state of the current process (all the registers).
  - sends an IACK signal and gets whatever info it needs to process the interrupt.
- (b) Ausschaften Chille the CPU, rather than having the CPU deci
  - from class ttps://eduassistpro.github.io/ d IACK
    To make t ttps://eduassistpro.github.io/ d IACK
    going to/from CPU. IRQ is ON if and only if at least on
    interrupt. This is done with the big OR gate du\_assist\_pro
    Add WeChat edu\_assist\_pro

# 11. (3 points – originally was 4)

- (a) The transfer of data from the computer to a printer does not use a clock. Why not? Describe a source initiated *handshaking* method that is used when the printer is connected by a *parallel port*. Draw a timing diagram to illustrate.
- (b) Suppose the one-byte ASCII values, 128 and 142, are sent as part of a message on a serial bus. Give a timing diagram of the binary values that would be measured when reading the bus. Explain values for all bits in your timing diagram.

#### **SOLUTION:**

- The reason a computer and printer do not communicate using a clock is that the physical distance between them is too large to make a fast clock feasible. With a clock-based bus, the signal has to stabilize over the entire bus before the end of the clock pulse. For the case of a printer, the cable is of indeterminate length (1 m, 3 ASSISTIMENT PROJECT Exam Help Moreover, the amount of data that is sent per second is much less than with, say, an interne
  - I was just https://eduassistpro.github.io/
    source initiated or destination initiated was fine
- (b) The ASCII values 123 and 142 are 10000000 chd 1000150 siest protesses, you need to have a start bit and a st Coulomb as the process of the pulse width of the bits that are being sent and that the receiving device samples from the center of each bit. i.e. each has length T seconds and it samples at T/2 from the start of each bit.